P15.2 Fortgeschrittenes Praxisprojekt
Dr. Nicolas Ferry - Bavarian National Forest Park / Daniel Schlichting - StabLab
31 Jan 2025
Model FCM levels - amongst other covariates - on spatial and temporal distance to hunting activities
Expectations:
Contains information on 809 faecal samples, including:
Samples were taken at irregular time intervals from 2020 to 2022.
Other sources of uncertainty include:
lack of information about hunting events (single time points as start, end, middle?)
unknown characteristics of the deer (e.g., age, health, etc.),
other unknown stressors (e.g., predators, human activities, weather, etc.),
unknown geographical features (e.g., terrain could affect the propagation of sound).
Deer location at the time of hunting event is approximated by linear interpolation:
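As a minimal sketch of this interpolation, assume the deer's position is known at two recorded fixes that bracket the hunting time, and that coordinates are in a projected (metric) system. The function name, the tuple layout and the example values are illustrative assumptions, not taken from the project code.

```python
from datetime import datetime


def interpolate_position(t, fix_before, fix_after):
    """Linearly interpolate (x, y) at time t between two position fixes.

    Each fix is a (timestamp, x, y) tuple. Coordinates are assumed to be
    projected, so interpolating along a straight line is meaningful.
    """
    t0, x0, y0 = fix_before
    t1, x1, y1 = fix_after
    if not t0 <= t <= t1:
        raise ValueError("t must lie between the two fixes")
    span = (t1 - t0).total_seconds()
    if span == 0:
        return x0, y0
    # Fraction of the interval that has elapsed at time t.
    w = (t - t0).total_seconds() / span
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)


# Hypothetical fixes two hours apart; the hunting event is 30 minutes in.
before = (datetime(2021, 10, 1, 8, 0), 0.0, 0.0)
after = (datetime(2021, 10, 1, 10, 0), 200.0, 100.0)
hunt_time = datetime(2021, 10, 1, 8, 30)
position = interpolate_position(hunt_time, before, after)
```

The approximation gets coarser as the gap between the two fixes grows, since the deer is assumed to move in a straight line at constant speed.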
A hunting event is considered relevant to a faecal sample, if
Among the relevant hunting events, the most relevant one is defined by one of the three proximity criteria:
We define the scoring function as follows:
\[ S(d, t) \propto \begin{cases} \frac{1}{d^2} \cdot f_{\mathbf{t}}(t), \quad t \sim \mathcal{N}(\mu, \sigma^2) & \text{if } t \leq \mu \\ \frac{1}{d^2} \cdot f_{\mathbf{t}}(t), \quad t \sim \mathrm{Laplace}(\mu, b) & \text{if } t > \mu \end{cases} \] where:
\[ \begin{aligned} d & \text{: Distance} \\ t & \text{: Time Difference} \\ \mu & \text{: GRT target = 19 hours} \end{aligned} \]
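The scoring function can be sketched in a few lines. Only the target \(\mu = 19\) hours is given in the slides; the values of `SIGMA` and `B` below are hypothetical placeholders, and the function names are illustrative.

```python
import math

MU = 19.0     # GRT target in hours (from the definition above)
SIGMA = 5.0   # hypothetical standard deviation of the normal branch
B = 10.0      # hypothetical scale of the Laplace branch


def normal_pdf(t, mu, sigma):
    """Density of N(mu, sigma^2) at t."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def laplace_pdf(t, mu, b):
    """Density of Laplace(mu, b) at t."""
    return math.exp(-abs(t - mu) / b) / (2.0 * b)


def score(d, t, mu=MU, sigma=SIGMA, b=B):
    """Unnormalised score: inverse squared distance times a time density.

    The normal density is used up to the target mu, the heavier-tailed
    Laplace density beyond it.
    """
    if d <= 0:
        raise ValueError("distance must be positive")
    density = normal_pdf(t, mu, sigma) if t <= mu else laplace_pdf(t, mu, b)
    return density / d ** 2
```

Because the score is only defined up to proportionality, it is meant for ranking hunting events against each other, not as an absolute quantity. With the placeholder values used here the two branches do not meet at \(t = \mu\); whether the branches are rescaled to join there is not stated in the slides.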
The marginal effects of distance and elapsed time since challenge on the score:
We suggest three different datasets for modelling:
| DataSet | GRT low | GRT high | Distance Threshold | Proximity Criterion | Deer | Observations |
|---|---|---|---|---|---|---|
| 1 | 0 | 36 | 10 | closest in time | 35 | 149 |
| 2 | 0 | 36 | 10 | nearest | 35 | 147 |
| 3 | 0 | 200 | 15 | score | 36 | 223 |
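How one row of such a dataset could be derived for a single faecal sample is sketched below. The sketch assumes that a hunting event is relevant when its time difference lies within the GRT window and its distance is within the threshold, and that "closest in time" means the smallest time difference. Both readings, the `Candidate` type and the example values are assumptions, and the scoring function passed in is a simple stand-in.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    event_id: str
    time_diff: float  # time between hunting event and defecation (assumed hours)
    distance: float   # deer-to-hunt distance, same unit as the threshold


def relevant_events(candidates, grt_low, grt_high, max_distance):
    """Keep events inside the GRT window and within the distance threshold."""
    return [c for c in candidates
            if grt_low <= c.time_diff <= grt_high and c.distance <= max_distance]


def most_relevant(candidates, criterion, score=None):
    """Pick one event according to the chosen proximity criterion."""
    if not candidates:
        return None
    if criterion == "closest in time":
        return min(candidates, key=lambda c: c.time_diff)
    if criterion == "nearest":
        return min(candidates, key=lambda c: c.distance)
    if criterion == "score":
        return max(candidates, key=lambda c: score(c.distance, c.time_diff))
    raise ValueError(f"unknown criterion: {criterion}")


# Hypothetical hunting events around one faecal sample.
events = [
    Candidate("A", time_diff=5.0, distance=8.0),
    Candidate("B", time_diff=20.0, distance=2.0),
    Candidate("C", time_diff=50.0, distance=1.0),
    Candidate("D", time_diff=12.0, distance=14.0),
]

# Datasets 1 and 2 share the window (0, 36) and threshold 10.
narrow = relevant_events(events, 0, 36, 10)
pick_1 = most_relevant(narrow, "closest in time")
pick_2 = most_relevant(narrow, "nearest")

# Dataset 3 uses the wider window (0, 200), threshold 15 and a score.
wide = relevant_events(events, 0, 200, 15)
toy_score = lambda d, t: 1.0 / d ** 2 / (1.0 + abs(t - 19.0))  # stand-in only
pick_3 = most_relevant(wide, "score", score=toy_score)

# Count of relevant events other than the selected one.
n_other = len(wide) - 1
```

Samples with no relevant hunting event return `None` and would drop out of the dataset, which is one way the observation counts can differ between the three variants.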
For Modelling, we consider the following covariates, defined for each pair of FCM sample and most relevant hunting event:
We chose two different approaches to Modelling:
Family: Gamma
Let \(i = 1,\dots,N\) be the indices of deer and \(j = 1,\dots,n_i\) be the indices of faecal samples for each deer
\[ \begin{eqnarray} \textup{FCM}_{ij} &\overset{\mathrm{ind.}}{\sim}& \mathcal{Ga}\left( \nu, \frac{\nu}{\mu_{ij}} \right) \quad\text{for}\; j = 1,\dots,n_i, \\ \mu_{ij} &=& \mathbb{E}(\textup{FCM}_{ij}) = \exp(\eta_{ij}), \\ \eta_{ij} &=& \beta_0 + \beta_1 \cdot \textup{number of other relevant hunting events}_{ij} + \\ && f_1(\textup{time difference}_{ij}) + f_2(\textup{distance}_{ij}) + \\ && f_3(\textup{sample delay}_{ij}) + f_4(\textup{defecation day}_{ij}) + \\ && \gamma_{i}, \\ \gamma_i &\overset{\mathrm{iid}}{\sim}& \mathcal{N}(0, \sigma_\gamma^2) \end{eqnarray} \]
\(f_1, f_2, f_3, f_4\) are penalised cubic regression splines.
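As a short worked check, assuming the shape-rate parameterisation of the Gamma distribution (shape \(\nu\), rate \(\nu / \mu_{ij}\)), which is the reading consistent with the stated mean, the first two moments are:

\[ \mathbb{E}(\textup{FCM}_{ij}) = \frac{\nu}{\nu / \mu_{ij}} = \mu_{ij}, \qquad \operatorname{Var}(\textup{FCM}_{ij}) = \frac{\nu}{(\nu / \mu_{ij})^2} = \frac{\mu_{ij}^2}{\nu} \]

The variance therefore grows with the square of the mean, and the coefficient of variation \(1 / \sqrt{\nu}\) is constant across observations. With the log link, each additive term in \(\eta_{ij}\) acts multiplicatively on the expected FCM level.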
High uncertainty about all estimated effects, across all datasets.
Instability with respect to estimation methods.
Estimation of random intercepts is sensitive to choice of dataset.
Consistent pattern of sample delay effect
| Model | Objective | Evaluation Metric | Max Depth | Eta | Gamma | Subsample | Colsample Bytree | Min Child Weight | Mean RMSE | SD RMSE | Number of Observations |
|---|---|---|---|---|---|---|---|---|---|---|---|
| last | reg:squarederror | rmse | 4 | 0.1635 | 5.850 | 0.5918 | 0.9921 | 4.640 | 168.6336 | 24.40957 | 149 |
| nearest | reg:squarederror | rmse | 4 | 0.1661 | 5.893 | 0.5956 | 0.9832 | 4.747 | 151.3186 | 17.91780 | 147 |
| score | reg:squarederror | rmse | 5 | 0.1744 | 5.834 | 0.6063 | 1.0000 | 4.766 | 147.9845 | 16.50250 | 223 |
We do this separately for all three datasets (closest in time, nearest and score).
Effect of Hunting on Red Deer